changhiskhan
Contributor

Member

Uh, did you test the performance impact of this?

Member

How about just returning a DatetimeIndex instead of a raw numpy array in this particular case?

wesm commented Dec 19, 2012

Boxing + dtype=object hash table pass (vs. int64) --> ~300x slowdown. We can make this into a vbenchmark.

In [48]: rng = date_range('1/1/2000', periods=10000, freq='T')

In [49]: arr = rng.repeat(10)

In [50]: timeit arr.unique()
1000 loops, best of 3: 1.11 ms per loop

In [51]: timeit arr.asobject.unique()
1 loops, best of 3: 320 ms per loop
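For anyone re-running the comparison outside IPython, here is a minimal sketch of the same measurement as a standalone script. It rests on some assumptions: `asobject` is no longer available in current pandas, so `astype(object)` stands in for the boxing step; `freq="min"` replaces the older `'T'` alias; and the loop counts are arbitrary. Absolute timings will differ by machine and pandas version, so no expected numbers are given.

```python
import timeit

import pandas as pd

# Same data as the session above: 10,000 minute-spaced timestamps, each repeated 10 times.
rng = pd.date_range("2000-01-01", periods=10000, freq="min")
arr = rng.repeat(10)

# Assumption: astype(object) is used here in place of the old `asobject`
# attribute; it boxes every value into a Timestamp and yields dtype=object.
boxed = arr.astype(object)

# Both paths should find the same 10,000 distinct values.
native_unique = arr.unique()
boxed_unique = boxed.unique()

# Time the native (int64-backed) path against boxing + the object hash table pass.
loops = 3
t_native = min(timeit.repeat(arr.unique, number=loops, repeat=3)) / loops
t_boxed = min(
    timeit.repeat(lambda: arr.astype(object).unique(), number=loops, repeat=3)
) / loops

print(f"native unique: {t_native * 1e3:.2f} ms per loop")
print(f"boxed unique:  {t_boxed * 1e3:.2f} ms per loop")
print(f"slowdown:      {t_boxed / t_native:.0f}x")
```

The boxed timing deliberately includes the `astype(object)` conversion, mirroring `arr.asobject.unique()` in the session above, where boxing and hashing were measured together.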

wesm closed this Dec 28, 2012